Scalable Tucker Factorization for Sparse Tensors - Algorithms and Discoveries

Authors

  • Sejoon Oh
  • Namyong Park
  • Lee Sael
  • U. Kang
Abstract

Given sparse multi-dimensional data (e.g., (user, movie, time; rating) tuples for movie recommendation), how can we discover latent concepts/relations and predict missing values? Tucker factorization has been widely used to solve such problems with multi-dimensional data, which are modeled as tensors. However, most Tucker factorization algorithms regard and estimate missing entries as zeros, which leads to highly inaccurate decompositions. Moreover, the few methods that do focus on accuracy exhibit limited scalability, since they require huge memory and heavy computational costs while updating factor matrices. In this paper, we propose P-TUCKER, a scalable Tucker factorization method for sparse tensors. P-TUCKER performs alternating least squares with a row-wise update rule in a fully parallel way, which significantly reduces the memory required for updating factor matrices. Furthermore, we offer two variants of P-TUCKER: a caching algorithm, P-TUCKER-CACHE, and an approximation algorithm, P-TUCKER-APPROX, both of which accelerate the update process. Experimental results show that P-TUCKER achieves a 1.7-14.1× speed-up and 1.4-4.8× lower error compared to the state of the art. In addition, P-TUCKER scales near-linearly with the number of observable entries in a tensor and with the number of threads. Thanks to P-TUCKER, we successfully discover hidden concepts and relations in a large-scale real-world tensor, while existing methods cannot reveal latent features due to their limited scalability or low accuracy.
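The abstract only names the row-wise ALS update rule; as a rough illustration, here is a minimal single-threaded sketch of how such a rule could look. All names (`X` as a COO-style dict from index tuples to values, dense core `G`, list `factors` of factor matrices) are hypothetical; this is not the authors' parallel implementation, whose speed comes from updating the independent rows concurrently.

```python
import numpy as np

def row_wise_als_update(X, G, factors, n, lam=1e-3):
    """One sweep of a row-wise ALS rule over factor matrix n.

    X       : dict mapping observed index tuples (i_1, ..., i_N) to values
    G       : dense core tensor, shape (J_1, ..., J_N)
    factors : list of factor matrices; factors[m] has shape (I_m, J_m)
    """
    N = len(factors)
    Jn = G.shape[n]
    other = [m for m in range(N) if m != n]

    # Group observed entries by their mode-n index: row i of factor n is
    # fit using only the observations whose n-th coordinate equals i, so
    # each row is an independent least-squares problem.
    by_row = {}
    for idx, val in X.items():
        by_row.setdefault(idx[n], []).append((idx, val))

    for i, entries in by_row.items():
        B = lam * np.eye(Jn)  # ridge-regularised normal-equation matrix
        c = np.zeros(Jn)
        for idx, val in entries:
            # delta = core G contracted, along every mode except n, with
            # the rows of the other factors picked out by this observation.
            delta = np.moveaxis(G, n, 0)
            for m in reversed(other):
                delta = delta @ factors[m][idx[m]]  # contract the last axis
            B += np.outer(delta, delta)
            c += val * delta
        factors[n][i] = np.linalg.solve(B, c)  # least-squares row update
```

Because `B` and `c` are accumulated only over observed entries, missing values never enter the fit as zeros, which is the accuracy argument the abstract makes.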


Similar articles

Algorithms for Sparse Nonnegative Tucker Decompositions

There is increasing interest in the analysis of large-scale multiway data, i.e., arrays of data with more than two dimensions, which take the form of tensors. To analyze such data, decomposition techniques are widely used. The two most common decompositions for tensors are the Tucker model and the more restricted PARAFAC model. Both models can be viewed as ...


Accelerating the Tucker Decomposition with Compressed Sparse Tensors

The Tucker decomposition is a higher-order analogue of the singular value decomposition and is a popular method of performing analysis on multi-way data (tensors). Computing the Tucker decomposition of a sparse tensor is demanding in terms of both memory and computational resources. The primary kernel of the factorization is a chain of tensor-matrix multiplications (TTMc). State-of-the-art ...
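For readers unfamiliar with the kernel, a naive TTMc over a COO tensor could be sketched as below (hypothetical `coords`/`vals`/`factors` names). The cited paper's point is precisely to avoid this naive form: compressed formats such as CSF reuse partial products shared between nonzeros, whereas this sketch recomputes everything.

```python
import numpy as np
from functools import reduce

def sparse_ttmc(coords, vals, factors, n, dim_n):
    """Naive TTMc for mode n over a COO tensor: for every nonzero, combine
    the rows of all factors except the n-th with a Kronecker product and
    accumulate the result into the output row given by the nonzero's n-th
    coordinate."""
    other = [m for m in range(len(factors)) if m != n]
    cols = int(np.prod([factors[m].shape[1] for m in other]))
    Y = np.zeros((dim_n, cols))
    for idx, val in zip(coords, vals):
        rows = [factors[m][idx[m]] for m in other]
        Y[idx[n]] += val * reduce(np.kron, rows)
    return Y
```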


Efficient tensor completion: Low-rank tensor train

This paper proposes a novel formulation of the tensor completion problem to impute missing entries of data represented by tensors. The formulation is introduced in terms of the tensor train (TT) rank, which can effectively capture global information of tensors thanks to its construction by a well-balanced matricization scheme. Two algorithms are proposed to solve the corresponding tensor completion ...
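As background on the TT format the snippet refers to, here is a minimal sketch of the classical TT-SVD procedure for a fully observed dense tensor. This is standard background, not the cited paper's completion algorithms, which handle missing entries.

```python
import numpy as np

def tt_svd(T, max_rank):
    """TT-SVD: split a dense tensor into tensor-train cores via sequential
    truncated SVDs. The k-th TT rank is the (truncated) rank of the
    matricization separating the first k modes from the remaining ones."""
    dims = T.shape
    cores, r_prev = [], 1
    M = T.reshape(dims[0], -1)
    for k in range(len(dims) - 1):
        U, S, Vt = np.linalg.svd(M, full_matrices=False)
        r = min(max_rank, len(S))
        cores.append(U[:, :r].reshape(r_prev, dims[k], r))  # k-th TT core
        # Carry the remainder S*Vt forward and fold in the next mode.
        M = (S[:r, None] * Vt[:r]).reshape(r * dims[k + 1], -1)
        r_prev = r
    cores.append(M.reshape(r_prev, dims[-1], 1))  # final core
    return cores
```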


Scalable Boolean Tensor Factorizations using Random Walks

Tensors are becoming increasingly common in data mining, and consequently, tensor factorizations are becoming more and more important tools for data miners. When the data is binary, it is natural to ask if we can factorize it into binary factors while simultaneously making sure that the reconstructed tensor is still binary. Such factorizations, called Boolean tensor factorizations, can provide ...
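To make the defining property concrete, a Boolean CP reconstruction replaces the sum of rank-one terms with a logical OR, so the rebuilt tensor stays binary by construction. The following is a definitional sketch assuming 0/1 integer factor matrices; it illustrates the model only, not the random-walk factorization method.

```python
import numpy as np

def boolean_cp_reconstruct(A, B, C):
    """Boolean CP reconstruction for a 3-way tensor: entry (i, j, k) is the
    logical OR over all rank-one binary components. A, B, C are 0/1 integer
    matrices with one column per component."""
    X = np.zeros((A.shape[0], B.shape[0], C.shape[0]), dtype=bool)
    for r in range(A.shape[1]):
        # Outer product of the r-th columns; > 0 turns it into a 0/1 mask.
        X |= np.einsum('i,j,k->ijk', A[:, r], B[:, r], C[:, r]) > 0
    return X
```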


DFacTo: Distributed Factorization of Tensors

We present a technique for significantly speeding up Alternating Least Squares (ALS) and Gradient Descent (GD), two widely used algorithms for tensor factorization. By exploiting properties of the Khatri-Rao product, we show how to efficiently address a computationally challenging sub-step of both algorithms. Our algorithm, DFacTo, only requires two sparse matrix-vector products and is easy to ...
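The computationally challenging sub-step referred to here is the matricized-tensor times Khatri-Rao product (MTTKRP) at the heart of CP-ALS and gradient descent. The naive COO-based sketch below (hypothetical argument names) only defines that kernel; DFacTo's reformulation of it into two sparse matrix-vector products is the paper's actual contribution and is not reproduced here.

```python
import numpy as np

def mttkrp_coo(coords, vals, factors, n):
    """Naive MTTKRP over a COO tensor: for every nonzero, multiply together
    (elementwise) the matching rows of all factor matrices except the n-th,
    scale by the value, and accumulate into output row idx[n]. All factors
    are assumed to share the same rank R."""
    R = factors[0].shape[1]
    other = [m for m in range(len(factors)) if m != n]
    out = np.zeros((factors[n].shape[0], R))
    for idx, val in zip(coords, vals):
        row = val * np.ones(R)
        for m in other:
            row *= factors[m][idx[m]]  # one row of the Khatri-Rao product
        out[idx[n]] += row
    return out
```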



Journal:
  • CoRR

Volume: abs/1710.02261   Issue: -

Pages: -

Publication date: 2017